Applying Online Search Techniques to Continuous-State Reinforcement Learning
Abstract
In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boost the power of very approximate value functions discovered by traditional reinforcement learning techniques. We examine local searches, where the agent performs a finite-depth lookahead search, and global searches, where the agent performs a search for a trajectory all the way from the current state to a goal state. The key to the success of the local methods lies in taking a value function, which gives a rough solution to the hard problem of finding good trajectories from every single state, and combining that with online search, which then gives an accurate solution to the easier problem of finding a good trajectory specifically from the current state. The key to the success of the global methods lies in using aggressive state-space search techniques such as uniform-cost search and A*, tamed into a tractable form by exploiting neighborhood relations and trajectory constraints that arise from continuous-space dynamic control.
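The local-search idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the 1-D task, the two-action set, the unit step cost, and the names `step`, `rough_value`, `lookahead_value`, and `choose_action` are all assumptions made up for illustration. The sketch expands every action sequence up to a fixed depth and backs up a deliberately flawed value function at the leaves.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical toy task (not from the paper): a point on a line must reach x >= 1.
ACTIONS: Sequence[float] = (-0.25, 0.25)


def step(x: float, a: float) -> Tuple[float, float, bool]:
    """Assumed known model: move by `a`, pay a cost of 1, stop at the goal."""
    nx = x + a
    return nx, -1.0, nx >= 1.0


def rough_value(x: float) -> float:
    """A deliberately inaccurate value function.

    It is roughly 'minus the number of steps to the goal', but it wrongly
    adds a bonus for negative x, which misleads a purely greedy agent.
    """
    return -4.0 * abs(1.0 - x) + (3.0 if x < 0.0 else 0.0)


def lookahead_value(
    x: float,
    depth: int,
    actions: Sequence[float],
    model: Callable[[float, float], Tuple[float, float, bool]],
    value_fn: Callable[[float], float],
    gamma: float = 1.0,
) -> float:
    """Best return found by exhaustive search to `depth`, using value_fn at leaves."""
    if depth == 0:
        return value_fn(x)
    best = float("-inf")
    for a in actions:
        nx, reward, done = model(x, a)
        future = 0.0 if done else lookahead_value(
            nx, depth - 1, actions, model, value_fn, gamma
        )
        best = max(best, reward + gamma * future)
    return best


def choose_action(
    x: float,
    depth: int,
    actions: Sequence[float],
    model: Callable[[float, float], Tuple[float, float, bool]],
    value_fn: Callable[[float], float],
    gamma: float = 1.0,
) -> float:
    """Pick the first action of the best depth-limited trajectory (depth >= 1)."""
    best_action, best_value = actions[0], float("-inf")
    for a in actions:
        nx, reward, done = model(x, a)
        future = 0.0 if done else lookahead_value(
            nx, depth - 1, actions, model, value_fn, gamma
        )
        value = reward + gamma * future
        if value > best_value:
            best_action, best_value = a, value
    return best_action


if __name__ == "__main__":
    # Depth 1 is greedy with respect to rough_value; depth 4 searches further.
    print(choose_action(0.0, 1, ACTIONS, step, rough_value))
    print(choose_action(0.0, 4, ACTIONS, step, rough_value))
```

In this toy setting, a one-step lookahead from x = 0 follows the flawed value function and moves away from the goal, while a four-step lookahead finds the trajectory that reaches the goal and moves toward it. Exhaustive expansion grows exponentially with depth, so this sketch is only practical for small depths and action sets.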
Similar resources
Applying Online Search Techniques to Reinforcement Learning
In reinforcement learning it is frequently necessary to resort to an approximation to the true optimal value function. Here we investigate the benefits of online search in such cases. We examine "local" searches, where the agent performs a finite-depth lookahead search, and "global" searches, where the agent performs a search for a trajectory all the way from the current state to a goal state. The...
Reinforcement Learning In Real-Time Strategy Games
We consider the problem of effective, automated decision-making in modern real-time strategy (RTS) games using reinforcement learning techniques. RTS games are environments with large, high-dimensional, continuous state and action spaces and temporally extended actions. To operate in such environments we propose Exlos, a stable, model-based Monte Carlo method. Contra...
Tree-Based Policy Learning in Continuous Domains through Teaching by Demonstration
This paper addresses the problem of reinforcement learning in continuous domains through teaching by demonstration. Our approach is based on the Continuous U-Tree algorithm, which generates a tree-based discretization of a continuous state space while applying general reinforcement learning techniques. We introduce a method for generating a preliminary state discretization and policy from exper...
Reinforcement Learning with Particle Swarm Optimization Policy (PSO-P) in Continuous State and Action Spaces
This article introduces a model-based reinforcement learning (RL) approach for continuous state and action spaces. While most RL methods try to find closed-form policies, the approach taken here employs numerical on-line optimization of control action sequences. First, a general method for reformulating RL problems as optimization tasks is provided. Subsequently, Particle Swarm Optimization (PS...
Reinforcement Learning in Neural Networks: A Survey
In recent years, research on reinforcement learning (RL) has focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. Using neural networks enables RL to search for optimal policies more efficiently in several real-life applicat...